Ensemble of online neural networks for non-stationary and imbalanced data streams

Authors

  • Adel Ghazikhani
  • Reza Monsefi
  • Hadi Sadoghi Yazdi
Abstract

Concept drift (non-stationarity) and class imbalance are two important challenges for supervised classifiers. "Concept drift" refers to changes in the underlying function being learnt, and class imbalance is a vast difference between the numbers of instances in different classes of data. Class imbalance is an obstacle for the efficiency of most classifiers. Research on classification of non-stationary and imbalanced data streams mainly focuses on batch solutions, whereas online methods are more appropriate. Here, we propose an online ensemble of neural network (NN) classifiers. Ensemble models are the most frequently used methods for classifying non-stationary and imbalanced data streams. The main contribution is a two-layer approach for handling class imbalance and non-stationarity. In the first layer, cost-sensitive learning is embedded into the training phase of the NNs, and in the second layer a new method for weighting classifiers of the ensemble is proposed. The proposed method is evaluated on 3 synthetic and 8 real-world datasets. The results show statistically significant improvement compared to online ensemble methods with similar features. © 2013 Elsevier B.V. All rights reserved.
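
The abstract does not give the details of the two layers, so the following Python sketch is only a rough illustration of the general idea, not the authors' algorithm. It assumes single logistic units as stand-ins for the NNs, class costs taken as inverse smoothed class frequencies, and member weights computed as an exponentially decayed, cost-weighted accuracy. All class, method, and parameter names (`CostSensitiveUnit`, `OnlineEnsemble`, `decay`, and so on) are hypothetical.

```python
import math
import random


class CostSensitiveUnit:
    """One logistic unit trained online; a stand-in for a single NN (assumption)."""

    def __init__(self, n_features, lr, rng):
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y, cost):
        # Layer 1 (assumed form): log-loss gradient step scaled by the class cost,
        # so errors on the rarer class move the weights further.
        err = (self.predict_proba(x) - y) * cost
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


class OnlineEnsemble:
    """Binary online ensemble with cost-weighted member scores (illustrative only)."""

    def __init__(self, n_features, n_members=5, decay=0.95, seed=0):
        rng = random.Random(seed)
        self.members = [
            CostSensitiveUnit(n_features, lr=0.05 * (i + 1), rng=rng)
            for i in range(n_members)
        ]
        self.scores = [1.0] * n_members
        self.decay = decay
        self.counts = [1, 1]  # smoothed counts for classes 0 and 1

    def class_cost(self, y):
        # Inverse-frequency cost: equals 1.0 when the classes are balanced.
        total = self.counts[0] + self.counts[1]
        return total / (2.0 * self.counts[y])

    def predict(self, x):
        total = sum(self.scores)
        p = sum(s * m.predict_proba(x) for s, m in zip(self.scores, self.members))
        return 1 if p / total >= 0.5 else 0

    def learn_one(self, x, y):
        self.counts[y] += 1
        cost = self.class_cost(y)
        for i, m in enumerate(self.members):
            # Layer 2 (assumed form): decayed score that rewards correct
            # predictions more when the true class is the costly one; the decay
            # lets the weights follow a drifting concept.
            correct = 1.0 if (m.predict_proba(x) >= 0.5) == (y == 1) else 0.0
            new_score = self.decay * self.scores[i] + (1 - self.decay) * cost * correct
            self.scores[i] = max(1e-6, new_score)
            m.update(x, y, cost)
```

Each instance is tested first and then used for training (`learn_one`), which is the usual evaluation order for online stream learners. The `decay` factor controls how quickly old performance is forgotten, and its value here is arbitrary.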

Similar resources

Recursive least square perceptron model for non-stationary and imbalanced data stream classification

Classifying non-stationary and imbalanced data streams encompasses two important challenges, namely concept drift and class imbalance. "Concept drift" (or non-stationarity) refers to changes in the underlying function being learnt, and class imbalance is a vast difference between the numbers of instances in different classes of data. Class imbalance is an obstacle for the efficiency of most classifiers...

Learning Framework for Non-stationary and Imbalanced Data Stream

Abstract: Although learning on non-stationary data and learning on imbalanced data have each been studied extensively in the literature, little work has been done to tackle the imbalance issue on non-stationary data streams, where the joint probability distribution between the data and classes changes with time and may result in a skewed class distribution. Especially in airline delay detection, data ...

A Dynamic Ensemble Framework for Mining Textual Streams with Class Imbalance

Textual stream classification has become a realistic and challenging issue since large-scale, high-dimensional, and non-stationary streams with class imbalance have been widely used in various real-life applications. Given the characteristics of textual streams, classifying them is technically difficult, especially in an imbalanced environment. In this paper, ...

Ensemble learning for data stream analysis: A survey

In many applications of information systems, learning algorithms have to act in dynamic environments where data are collected in the form of transient data streams. Compared to static data mining, processing streams imposes new computational requirements for algorithms to incrementally process incoming examples while using limited memory and time. Furthermore, due to the non-stationary character...

Parallel Online Continuous Arcing with a Mixture of Neural Networks

This paper presents a new arcing (boosting) algorithm called POCA, Parallel Online Continuous Arcing. Unlike traditional arcing algorithms (such as AdaBoost), which construct an ensemble by adding and training weak learners sequentially on a round-by-round basis, training in POCA is performed over an entire ensemble continuously and in parallel. Since members of the ensemble are not frozen after...

Journal title:
  • Neurocomputing

Volume 122  Issue 

Pages  -

Publication date 2013